[multiple]: Support agent-based BM SNO deployment#3739

Open
bogdando wants to merge 5 commits into openstack-k8s-operators:main from bogdando:dev

@bogdando bogdando commented Mar 4, 2026

Based on SNO PR #3129

  • Rebase after the SNO PR is merged

Allow deploying SNO OCP for RHOSO control plane instead of
the classic hybrid jobs approach.

Change the controller-0 which runs dev-scripts and deploy
architecture script to become the zuul controller node.

Skip libvirt/vmnet configuration and use no VMs at all.

Ditch dev-scripts and use agent-based openshift-installer
to also cover scenarios with isolated L2 domains between
zuul controller, SNO BM, and EDPM BM (will be added in
the future).

Allow auto-configuring USB boot on the target SNO host,
and allow auto-discovery (or validation) of the UEFI target
to boot from as Virtual Media Live CD. It is important
to make sure we boot from the image that we build, as
we do not wipe the target host disks; without
those guard rails the result may be confusing behavior
(booting from unexpected sources).
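As a rough illustration of the discovery-or-validation guard rail described above, here is a minimal sketch. It is not the code from this PR. It assumes the boot options arrive as a list of Redfish-style `BootOption` dictionaries carrying `BootOptionReference` and `DisplayName`; the function name `pick_boot_target` and the `"virtual"` match marker are hypothetical.

```python
from typing import Optional


def pick_boot_target(boot_options, wanted: Optional[str] = None,
                     marker: str = "virtual") -> str:
    """Validate an explicit boot target, or discover exactly one by display name.

    boot_options: list of dicts shaped like Redfish BootOption resources.
    wanted: an operator-supplied BootOptionReference to validate (optional).
    marker: hypothetical substring identifying the virtual media device.
    """
    # Map each boot option reference to its display name, skipping broken entries.
    refs = {}
    for opt in boot_options:
        ref = opt.get("BootOptionReference")
        if ref:
            refs[ref] = opt.get("DisplayName") or ""

    # Validation mode: the configured target must actually be offered by the host.
    if wanted is not None:
        if wanted not in refs:
            raise ValueError(f"boot target {wanted!r} is not offered by the host")
        return wanted

    # Discovery mode: refuse to guess when zero or several candidates match.
    matches = [ref for ref, name in refs.items() if marker.lower() in name.lower()]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one {marker!r} boot option, found {len(matches)}"
        )
    return matches[0]
```

Failing on an ambiguous match, rather than picking the first candidate, is the point of the sketch: since the target disks are not wiped, a wrong guess could boot a stale installation.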

Allow live debug mode for agent appliance.

Password handling for agent appliance and OCP:

  • Pre-ISO generation (for post-bootstrap):
    • MachineConfig 99-core-password.yaml -- sets password via MCO after
      cluster is up
  • Post-ISO generation (for discovery phase):
    • coreos-installer iso ignition show -- extracts the embedded
      ignition from the agent ISO
    • patch_ignition.py -- patches the ignition JSON to add
      passwordHash on the core user and a getty@tty1.service autologin
      drop-in
    • coreos-installer iso ignition embed -f -- re-embeds the patched
      ignition back into the ISO
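The patching step in the middle of the post-ISO flow could look roughly like the sketch below. This is an illustration only, not the `patch_ignition.py` shipped in this PR. It assumes an Ignition v3 style JSON document (`passwd.users[].passwordHash`, `systemd.units[].dropins[]`); the drop-in file name `autologin.conf` and the exact `agetty` command line are assumptions.

```python
import json

# Assumed drop-in contents: clear the stock ExecStart, then log in "core" on tty1.
AUTOLOGIN_DROPIN = (
    "[Service]\n"
    "ExecStart=\n"
    "ExecStart=-/usr/sbin/agetty --autologin core --noclear %I $TERM\n"
)


def patch_ignition(ignition: dict, password_hash: str) -> dict:
    """Return a patched copy: core user password hash plus a tty1 autologin drop-in."""
    # Deep copy via a JSON round-trip so the caller's document stays untouched.
    patched = json.loads(json.dumps(ignition))

    # Find or create the "core" user, then set its password hash.
    users = patched.setdefault("passwd", {}).setdefault("users", [])
    core = next((u for u in users if u.get("name") == "core"), None)
    if core is None:
        core = {"name": "core"}
        users.append(core)
    core["passwordHash"] = password_hash

    # Find or create the getty@tty1.service unit entry.
    units = patched.setdefault("systemd", {}).setdefault("units", [])
    unit = next((u for u in units if u.get("name") == "getty@tty1.service"), None)
    if unit is None:
        unit = {"name": "getty@tty1.service"}
        units.append(unit)

    # Replace any earlier autologin drop-in so repeated runs stay idempotent.
    dropins = [d for d in unit.get("dropins", []) if d.get("name") != "autologin.conf"]
    dropins.append({"name": "autologin.conf", "contents": AUTOLOGIN_DROPIN})
    unit["dropins"] = dropins
    return patched
```

In the flow listed above, the input would be the JSON printed by `coreos-installer iso ignition show`, and the result would be written to a file and passed to `coreos-installer iso ignition embed -f`.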

Jira: OSPRH-26767
Generated-by: Cursor (claude-4.6-opus-high)
Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>

openshift-ci bot commented Mar 4, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci bot commented Mar 4, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign dasm for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bogdando bogdando force-pushed the dev branch 5 times, most recently from ba7f9a3 to 647266a on March 4, 2026 17:12

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/9bd4674b8d4c4b52b1cd39592c1a29c8

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 15m 24s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 20m 36s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 35m 01s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 01m 18s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 15s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 23m 16s
cifmw-pod-pre-commit FAILURE in 8m 34s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 43s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 39s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 40m 33s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 03s

@bogdando bogdando force-pushed the dev branch 7 times, most recently from 9858edf to cd5e154 on March 5, 2026 14:51

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c078ef19bb6347c28eeaf5336ae0bbe7

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 18m 59s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 23m 00s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 32m 32s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 04m 53s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 07s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 07s
cifmw-pod-pre-commit FAILURE in 8m 24s
✔️ cifmw-molecule-devscripts SUCCESS in 11m 29s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 39s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 40m 04s
✔️ cifmw-molecule-reproducer SUCCESS in 14m 49s

@bogdando bogdando force-pushed the dev branch 2 times, most recently from b359870 to b15906d on March 6, 2026 11:05

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/24c52f4231c343c5b77791ceeb4a0b06

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 14m 54s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 21m 32s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 33m 05s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 01m 27s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 08s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 49s
cifmw-pod-pre-commit FAILURE in 8m 31s
✔️ cifmw-molecule-devscripts SUCCESS in 11m 48s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 32s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 39m 38s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 13s

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/302bb0bb5235426b982f4db10356cc88

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 05m 34s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 23m 02s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 28m 54s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 1h 52m 26s
✔️ cifmw-pod-zuul-files SUCCESS in 25m 39s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 47s
cifmw-pod-pre-commit FAILURE in 7m 53s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 16s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 47s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 39m 59s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 23s

@bogdando bogdando force-pushed the dev branch 3 times, most recently from 1a09008 to 5a68d1a on March 9, 2026 12:06

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/0aefcac6cf914cf1838e68de6b58b488

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 36m 50s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 18m 36s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 24m 52s
cifmw-crc-podified-edpm-baremetal-minor-update FAILURE in 46m 01s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 12s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 06s
cifmw-pod-pre-commit FAILURE in 8m 27s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 51s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 51s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 40m 35s
✔️ cifmw-molecule-reproducer SUCCESS in 14m 06s

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/b103a318d7a84c699c19ccb2c9b8f7d8

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 08m 24s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 15m 28s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 36m 01s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 1h 55m 33s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 47s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 15s
cifmw-pod-pre-commit FAILURE in 8m 39s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 55s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 42s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 40m 55s
✔️ cifmw-molecule-reproducer SUCCESS in 16m 17s

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/7afeaad5ed58496e8faf2a68640b4234

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 13m 51s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 22m 10s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 23m 58s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 00m 55s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 32s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 11s
cifmw-pod-pre-commit FAILURE in 10m 03s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 31s
✔️ cifmw-molecule-dnsmasq SUCCESS in 5m 02s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 40m 59s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 15s

@bogdando bogdando force-pushed the dev branch 2 times, most recently from 03404f0 to a7c97c5 on March 13, 2026 10:31
@bogdando bogdando requested a review from danpawlik March 13, 2026 10:31

@bogdando

I tested it downstream.

@bogdando

Note, this needs #3129 to land first

@bogdando bogdando requested review from abays, cjeanner and hjensas March 13, 2026 10:35
@bogdando bogdando force-pushed the dev branch 2 times, most recently from 870a219 to cc7d7f5 on March 13, 2026 11:00
@bogdando bogdando changed the title from "SNO BM support for hybrid CI jobs" to "[multiple]: Support agent-based BM SNO deployment" on Mar 13, 2026

bogdando commented Mar 13, 2026

Found an issue while testing this again

UPDATE: resolved with the top commit

@bogdando bogdando force-pushed the dev branch 3 times, most recently from 8f245fe to 22c86ac on March 16, 2026 10:13

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/91484a74fcb2454e92181cad2382c363

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 07m 33s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 20m 34s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 23m 11s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 1h 55m 13s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 42s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 48s
cifmw-pod-pre-commit FAILURE in 8m 16s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 50s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 40s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 40m 59s
✔️ cifmw-molecule-reproducer SUCCESS in 17m 04s

danpawlik and others added 4 commits March 18, 2026 17:22

This commit allows the reproducer to create an OpenShift cluster
using the Single Node OpenShift (SNO) feature.

Signed-off-by: Daniel Pawlik <dpawlik@redhat.com>

Replace the hardcoded value with a variable.
This value may need changes for SNO BM cases.

Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>

Allow deploying SNO OCP for RHOSO control plane instead of
the classic hybrid jobs approach.

Change the controller-0 which runs dev-scripts and deploy
architecture script to become the zuul controller node.

Skip libvirt/vmnet configuration and use no VMs at all.

Ditch dev-scripts and use agent-based openshift-installer
to also cover scenarios with isolated L2 domains between
zuul controller, SNO BM, and EDPM BM (will be added in
the future).

Allow auto-configuring USB boot on the target SNO host,
and allow auto-discovery (or validation) of the UEFI target
to boot from as Virtual Media Live CD. It is important
to make sure we boot from the image that we build, as
we do not wipe the target host disks; without
those guard rails the result may be confusing behavior
(booting from unexpected sources).

Allow live debug mode for agent appliance.

Password handling for agent appliance and OCP:
* Pre-ISO generation (for post-bootstrap):
  - MachineConfig 99-core-password.yaml -- sets password via MCO after
    cluster is up
* Post-ISO generation (for discovery phase):
  - coreos-installer iso ignition show -- extracts the embedded
    ignition from the agent ISO
  - patch_ignition.py -- patches the ignition JSON to add
    passwordHash on the core user and a getty@tty1.service autologin
    drop-in
  - coreos-installer iso ignition embed -f -- re-embeds the patched
    ignition back into the ISO

Generated-by: Cursor (claude-4.6-opus-high)
Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>

Fix bindep-install script linting errors

Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/b4d077a140bf4218a1095f2fc8dc00ea

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 21m 30s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 22m 29s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 22m 57s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 07m 56s
cifmw-pod-zuul-files FAILURE in 4m 45s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 42s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 14s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 46s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 54s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 40m 37s
✔️ cifmw-molecule-reproducer SUCCESS in 1h 04m 11s

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/5cac6ab3cfcd4e15b37e3b68d4cc6604

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 07m 31s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 23m 42s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 19m 35s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 1h 53m 08s
✔️ cifmw-pod-zuul-files SUCCESS in 6m 38s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 10m 05s
cifmw-pod-pre-commit FAILURE in 8m 49s
cifmw-molecule-bm_sno FAILURE in 3m 18s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 46s
✔️ cifmw-molecule-dnsmasq SUCCESS in 5m 18s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 42m 56s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 55s

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/950bcf1a311d48dab6e32af6c1348b68

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 22m 31s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 22m 18s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 27m 09s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 07m 39s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 17s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 36s
cifmw-pod-pre-commit FAILURE in 8m 04s
cifmw-molecule-bm_sno FAILURE in 3m 15s
✔️ cifmw-molecule-devscripts SUCCESS in 10m 40s
✔️ cifmw-molecule-dnsmasq SUCCESS in 4m 44s
✔️ cifmw-molecule-libvirt_manager SUCCESS in 41m 39s
✔️ cifmw-molecule-reproducer SUCCESS in 15m 05s

Rename:
cifmw_reproducer_bm_ocp -> cifmw_bm_sno
cifmw_devscripts_bm_nodes -> cifmw_bm_nodes

Change defaults:
OpenShift version, and auto-enable USB boot in the target server BIOS.

Also extract injection into a separate task, and cover it with tests.
Make sure no credentials are leaking.
Fix ejecting an already inserted image.

On iDRAC 9 (fw 4.x), EjectMedia sets Inserted=false but the Image URL
and internal Remote File Share connection linger indefinitely. Redfish
PATCH on VirtualMedia/CD returns 405 (only GET,HEAD allowed), and no
amount of waiting releases the stale RFS -- InsertMedia keeps failing
with "already connected" (RH BZ#1910739).

Work around this iDRAC limitation by SSH-ing into the BMC and running
racadm directly, when Image persists after the Redfish eject.

Generated-by: claude-4.6-opus-high
Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>
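
The stale-media condition described in this commit message can be expressed as a small check on the `VirtualMedia` resource returned by a Redfish GET. This is a minimal sketch under assumptions, not the task code from this PR: the function names and the action labels are hypothetical, and only the standard `Inserted` and `Image` properties are inspected.

```python
def stale_virtual_media(vm: dict) -> bool:
    """True when the eject was accepted (Inserted is false) but an Image URL lingers."""
    return not vm.get("Inserted", False) and bool(vm.get("Image"))


def next_action(vm: dict) -> str:
    """Pick the next step for a VirtualMedia resource (hypothetical action labels)."""
    if vm.get("Inserted"):
        # Media is still attached: a regular Redfish EjectMedia comes first.
        return "eject"
    if stale_virtual_media(vm):
        # Redfish reports the eject as done, yet the image URL is still set:
        # fall back to racadm over SSH, as the commit message describes.
        return "racadm-fallback"
    # Slot is clean, so InsertMedia can proceed.
    return "insert"
```

Checking the resource state after the eject, instead of waiting and retrying InsertMedia, matches the observation above that no amount of waiting releases the stale connection.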